Corpus

Corpus Description

The corpus for this musical analysis is a collection of albums (see Appendix) that I chose based on personal interest. I chose albums specifically because that is the form in which I listen to most of my music. The corpus spans a variety of genres, including rock, pop, post-rock, ambient, shoegaze, neo-psychedelia and hip-hop.

There are also some albums that can be considered atypical for these categories, for example LONG SEASON by Fishmans. This album consists of a single 35-minute song that blends a variety of genres. Another example is REGRET WHEN IT WAS LOST by death’s dynamic shroud, which at points blurs the line between surrealism and music. A third is F# A# ∞ by Godspeed You! Black Emperor, an album whose recurring spoken monologues create an eerie atmosphere. I included these albums not only because they are some of my personal favorites but also for their uniqueness, which should hopefully lead to a more interesting analysis. The corpus contains more such examples, which I will cover further in the analysis. However, most of the corpus consists of more typical, genre-representative albums. For example, the albums by Radiohead, No Party For Cao Dong, tricot, Pink Floyd, Polyphia and Panchiko are fairly typical of rock. All in all, the corpus combines typical and atypical albums, which I want to use to find out what makes music interesting to me and what draws me to specific albums. This should also lead to an interesting analysis of the distinctions and overlap between genres and between individual albums and songs in the corpus.

Appendix (album - artist)

Global Analysis

Description

Analysis over the entire range of songs

The multivariate analysis plot compares the liveness, acousticness and speechiness of each individual song. What is interesting about the plot is that there appears to be very little speechiness, which is relatively odd as many of the songs should contain speech. Furthermore, the liveness seems rather low, possibly due to the amount of ambient music. What is even more odd is that many of the songs with the highest liveness are ambient-leaning songs (this may be due to Spotify’s algorithm not fully comprehending ambient music). The acousticness, on the other hand, seems all over the place, likely due to the vast differences between genres.

Differences in energy

The energy ranking plot below shows the mean energy level of each album in descending order. What is interesting about this plot is that the albums seem to cover nearly the entire energy range, from A I A: Alien Observer at around 0.13 to World Is Yours at almost 1, with an overall mean of around 0.6.
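As a minimal sketch of how such a ranking can be computed, the Python snippet below averages one feature per album and sorts the albums in descending order. The album names and energy values are made-up placeholders, not the corpus data, and the function name is only illustrative.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (album, energy) rows; NOT the actual corpus values.
tracks = [
    ("Album A", 0.10), ("Album A", 0.16),
    ("Album B", 0.95), ("Album B", 0.99),
    ("Album C", 0.55), ("Album C", 0.65),
]

def rank_albums_by_mean(rows):
    """Return (album, mean value) pairs sorted from highest to lowest mean."""
    by_album = defaultdict(list)
    for album, value in rows:
        by_album[album].append(value)
    means = {album: mean(values) for album, values in by_album.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

ranking = rank_albums_by_mean(tracks)
for album, average in ranking:
    print(f"{album}: {average:.2f}")
```

The same function works for any per-track feature, so an acousticness ranking only needs a different value column.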

Differences in acousticness

The acousticness ranking shows the mean acousticness level of each album in descending order. The pattern appears to be the complete opposite of the energy ranking, with most ambient-leaning albums at the top of the chart and rock-leaning albums at the bottom. A low acousticness for rock seems normal, as many rock albums use electric guitars, basses, synthesizers and so on. However, for ambient-leaning albums to rank so high on acousticness seems strange. Take for instance 新しい日の誕生 (the album with the second-highest acousticness), an album that can be classified as ambient/dreampunk and has almost nothing acoustic about it. In fact, the only part that seems acoustic is the drums that play in the background of some songs; the rest is clearly non-acoustic. This classification illustrates a limitation of the Spotify API: Spotify does not give enough information on how the different attributes are calculated and what they mean. This outcome may be normal if Spotify considers a song acoustic as soon as one instrument appears to be acoustic, but we cannot be certain.

Multivariate Analysis

Energy Ranking

Acousticness Ranking

Keys and Tempos

To the right are plots of the key mode and tempo distribution over all songs.

According to the key mode plot, D major is the most common key, followed by G major and B minor. What is interesting, though, is that all keys that Spotify assigns are represented at least once. There is also a strange pattern in the frequencies: the least common key modes all hover between 7 and 8 occurrences, apart from G# major (at 3 occurrences). I have yet to find the reason behind this, though it could be a coincidence.

The tempo distribution shows that the songs in the corpus are spread across a wide range of tempos, with a large peak at around 125 BPM and a smaller peak at around 88 BPM. Furthermore, the mean tempo of the corpus (shown with a red line) is ~120 BPM, which is often said to be the tempo that listeners are most drawn to and is right around the average BPM for songs in general.

Density Distribution of the Tempo

Frequency of Different Keys

All Genres and Their Frequency

Album & Genre Comparisons

Rock vs Ambient

In the plot below I compared the rock albums “World Is Yours”, “MASS OF THE FERMENTING DREGS”, “T H E”, “醜奴兒” and “In Rainbows” to the ambient albums “A I A: Alien Observer”, “Building a Better World”, “Music Has the Right to Children”, “小圈子” and “新しい日の誕生”. As you might notice, the genres in the appendix do not directly classify some of these albums as rock or ambient. This is due to the way that Spotify classifies genres: it assigns a genre to the artist, not to the album. Furthermore, Spotify’s genres are very specific, resulting in a large number of non-matching genres (see the genre distribution plot). That is why I used the Spotify genres in combination with the genres listed on one of the largest music databases, Rate Your Music.

The right plot gives a surprising result: the mean tempo (and its standard deviation) of ambient and rock are not far apart. While ambient does appear to be shifted slightly towards a lower BPM than rock, the two are still very close. Where we can see a difference is in volume and duration: ambient tracks appear to be longer and quieter than rock tracks.

The left plot tells a different story. It shows quite a bit of variation between rock and ambient in the timbre coefficients, most noticeably at c01, c02, c05 and c06. This might give a clue as to why the genres appear so similar in the right plot, yet sound and feel so different.

Tempo with Duration & Volume

Timbre Coefficients at Different Octaves

Various other comparisons

Classification and Clustering

PCA

As we explore the corpus, more and more variables turn out to have a great influence. This raises the question of which variables the songs in the corpus share and which differ vastly. The variables that all songs share could be a good indicator of what draws me to certain songs and albums, whereas the variables that differ could point to things that I am not really interested in or that do not matter for me to like a song. To explore these variables and their contribution to my interest, I analysed the corpus with a PCA. The PCA plot shows what percentage of the variance is explained by a given variable. For the analysis I used the variables “danceability”, “energy”, “speechiness”, “acousticness”, “instrumentalness”, “liveness” and “valence” of each song. As you might notice, the “tempo”, “key” and “mode” variables are not present. The reason is that together they explained 99% of the variance, meaning that my music taste does not lean towards a specific key, mode or tempo. Leaving them out makes the variance in the other variables visible and yields a more interesting analysis.

The scree plot of the PCA shows this contribution per variable and is in line with the “vector plot”, in the sense that the highest contributor in the scree plot is also the largest and reddest vector in the “vector plot”. The PCA shows that there are a few variables that are necessary for me to enjoy an album or song: “liveness”, “speechiness”, “danceability” and “valence” all need to be in the same range, whereas “instrumentalness”, “acousticness” and “energy” can differ somewhat (though not as much as tempo, mode or key). This is also in line with the findings of the ambient and rock comparison, where “energy”, “acousticness” and “instrumentalness” showed the greatest difference between the two.
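One thing worth checking when a few variables take up almost all of the variance is their scale: tempo is measured in BPM, while features such as energy and valence lie between 0 and 1. The sketch below uses made-up values (not the corpus data) to show how a large-scale variable dominates the total variance until every variable is standardized. Whether the analysis above standardized its variables is not stated, so this only illustrates the effect and is not a reconstruction of the PCA itself.

```python
from statistics import mean, pstdev, pvariance

# Made-up toy values for three features; NOT taken from the corpus.
features = {
    "tempo":   [88.0, 125.0, 140.0, 206.0, 97.0],   # BPM, large numeric range
    "energy":  [0.13, 0.60, 0.75, 0.98, 0.40],      # between 0 and 1
    "valence": [0.20, 0.35, 0.30, 0.45, 0.25],      # between 0 and 1
}

def variance_shares(columns):
    """Share of the total variance that each variable contributes."""
    variances = {name: pvariance(values) for name, values in columns.items()}
    total = sum(variances.values())
    return {name: variance / total for name, variance in variances.items()}

def standardize(columns):
    """Rescale every variable to mean 0 and standard deviation 1 (z-scores)."""
    scaled = {}
    for name, values in columns.items():
        mu, sigma = mean(values), pstdev(values)
        scaled[name] = [(value - mu) / sigma for value in values]
    return scaled

raw_shares = variance_shares(features)
scaled_shares = variance_shares(standardize(features))
print({name: round(share, 4) for name, share in raw_shares.items()})
print({name: round(share, 4) for name, share in scaled_shares.items()})
```

On the raw values tempo accounts for nearly all of the variance, while after standardizing each variable contributes an equal share, so the principal components then depend on how the variables co-vary and not on their units.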

Scree Plot of the Different Variables

Vector Plot of the Variables Contribution

Clustering

Besides finding the contribution of the different variables, I wanted to find groups in the corpus that are similar to each other, in the hope of discovering genres. In the rock vs ambient analysis, I already explained how Spotify uses a vast number of very specific genres, which led me to use another website to find them for certain albums. But this seems a bit redundant. After all, Spotify’s API gives us a lot of variables, so why not discover the genres ourselves? To achieve this I used the k-means algorithm together with an analysis of the sum of squares for different numbers of clusters. In the sum of squares plot we can see that after three clusters the sum of squares no longer decreases nearly as fast, which suggests that 3 clusters are enough to describe the data. These 3 clusters are shown in the rightmost plot.

The fact that the analysis pointed to 3 clusters was a surprise to me. After all, the earlier analysis led me to believe that my music only contained 2 main genres. Although these clusters might not reflect actual genres, it is still interesting to see that there may be more overlapping genres than I expected.
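The sum-of-squares reasoning can be sketched in a few lines of Python. The example below runs a plain k-means on made-up 2D points (three tight groups, not the corpus) and records the within-cluster sum of squares for several values of k. To keep the result reproducible it starts from a deterministic farthest-first choice of centers, which is a simplification of the random starts k-means normally uses; all names are illustrative.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_centers(points, k):
    # Deterministic start (assumption for this sketch): begin with the first
    # point, then repeatedly add the point farthest from all chosen centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(squared_distance(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Return (labels, within-cluster sum of squares) for k clusters."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    wss = sum(squared_distance(p, centers[label]) for p, label in zip(points, labels))
    return labels, wss

# Made-up data: three tight groups of four points each.
group_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + dx, cy + dy) for cx, cy in group_centers
          for dx in (-0.5, 0.5) for dy in (-0.5, 0.5)]

wss_by_k = {k: kmeans(points, k)[1] for k in range(1, 5)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))
```

On this toy data the sum of squares falls sharply up to k = 3 and barely changes afterwards, which is the "elbow" used to pick the number of clusters.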

As I cannot show the names of the songs in the k-means plot, I made a data table below that lists which cluster each song belongs to.

Analysis for the Number of Clusters

K-means Clusters

Corpus with Clusters

Clustering Albums with a Tree

Besides clustering the songs, I also thought it would be interesting to visualize the clusters with a tree. However, there are too many songs to make a meaningful plot, which is why I built the tree from the albums instead. For each album, I took the mean of all variables and plotted the result.

The final tree can be cut at different points, but if we follow the result of k-means and cut the tree into 3 clusters, we get a cluster reminiscent of ambient, a cluster reminiscent of rock and a very large cluster with what seems like a mixture of genres but no specific one. So next to ambient and rock there seems to be something of a genre that overlaps most of my music taste, but what exactly it is, is difficult to determine and is likely related to the PCA above.
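A minimal sketch of this idea in Python: start with every album as its own cluster, keep merging the two closest clusters, and stop when three remain, which corresponds to cutting the tree into three groups. The album names and mean feature values are invented for illustration, and average linkage is an assumption, since the linkage used for the tree above is not stated.

```python
from itertools import combinations
from math import dist

# Hypothetical album means for two features; NOT the corpus values.
album_means = {
    "Ambient A": (0.15, 0.90),
    "Ambient B": (0.20, 0.85),
    "Rock A": (0.95, 0.05),
    "Rock B": (0.90, 0.10),
    "Mixed A": (0.55, 0.45),
    "Mixed B": (0.60, 0.40),
    "Mixed C": (0.50, 0.50),
}

def average_linkage(cluster_a, cluster_b, vectors):
    """Mean Euclidean distance over all pairs drawn from the two clusters."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

def agglomerate(vectors, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[name] for name in vectors]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], vectors),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

clusters = agglomerate(album_means, n_clusters=3)
for cluster in clusters:
    print(sorted(cluster))
```

Recording the order and distance of each merge would give the full tree; stopping early as done here is equivalent to cutting that tree at the level where three clusters are left.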

Tree cluster

Individual Songs

Chroma Analysis of selected songs

Below are chroma plots showing interesting patterns in selected songs from my corpus.

Fishmans - LONG SEASON

LONG SEASON is a song of roughly 35 minutes that contains multiple sections that flow into each other. At the red lines a clear transition between these sections is visible. There are more transitions, but some sections blend over such a long period that the transitions become hard to annotate (as is visible in the section after the first red line).

F# A# ∞ - The Dead Flag Blues

The Dead Flag Blues is another example of a song that is unique in its presentation. It sets an ominous stage through its use of multiple monologue sections. This is further emphasized in the annotated section (between the red lines): a long, slow descending scale followed by an ascending scale over a period of around 400 seconds.

World is Yours - Nan Nan

World Is Yours is the album with the highest average energy level, and Nan Nan should be a very good example of that. However, the chromagram shows something slightly different: very fast sections followed by relatively slower sections. This likely means that the fast sections and the instruments used are responsible for the high energy levels.

Drowning in the sewer - Hopelessness/ Slowdeath

Hopelessness and Slowdeath are both by the same artist, whose genre is hard to define but lies somewhere between ambient and breakcore. The cool thing about these two chromagrams is that both genres are clearly visible. The darker sections are more ambient and slow, whilst the bright sections lean more towards breakcore and are very fast.

LONG SEASON Chroma

The Dead Flag Blues Chroma

Nan Nan Chroma

Hopelessness Chroma

Slowdeath Chroma

Dynamic Time Warping

In Underwater DTW I compared the live version of Underwater by Elephant Gym to their studio recording. The apparent diagonal line in the graph shows that the two versions are very similar to each other. That the 2 versions are so similar is not strange, as Underwater is a math-rock song. Such a song is performed by a band in which typically the drummer and/or the bassist decides the rhythm and tempo at which the whole band plays. If the rhythm or tempo is slightly off, the whole song will ‘fall apart’, so keeping a consistent rhythm and tempo is crucial to a performance. That being said, the 2 versions do still differ, just not in rhythm or tempo. Rather, they differ in timbre and sound stage, likely due to how and where they were recorded and to things like the effects used on the instruments (e.g. guitar pedals) or effects that were added later.
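To make the "diagonal line" concrete, here is a small Python sketch of dynamic time warping on two short number sequences. The sequences are invented stand-ins for the feature frames of a studio and a live take (the real comparison works on audio features, not single numbers), and the function name is illustrative.

```python
def dtw(a, b):
    """Return (total alignment cost, warping path) for two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest way to align the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Walk back from the end, preferring the diagonal step when costs tie.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda cell: cost[cell[0]][cell[1]])
    path.reverse()
    return cost[n][m], path

# Toy example: the "live" take holds the second value for one extra frame.
studio = [0, 1, 2, 3, 2, 1, 0]
live = [0, 1, 1, 2, 3, 2, 1, 0]

total_cost, path = dtw(studio, live)
print(total_cost)
print(path)
```

Two identical sequences give a perfectly diagonal path; here the path leaves the diagonal for exactly one step, at the held value, and the total cost stays zero because the content is otherwise the same.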

Underwater DTW

Analysis of the Timbre and Chroma of 進化 by 猫 シ Corp. and t e l e p a t h テレパシー能力者

The self-similarity plot shows the timbre and chroma of 進化 by 猫 シ Corp. and t e l e p a t h テレパシー能力者. I chose this song as it has the highest liveness recorded by Spotify of any song in the corpus (see Plot A on the ‘Global Analysis’ tab). What is interesting about this classification is that this song is not a live recording in the slightest. Then why the high score? The classification is likely due to the apparent outside noise in the song: 進化 (which translates to “evolution”) is an ambient-focused song that incorporates sounds like fire and noise from train stations. This likely confuses the Spotify algorithm to the point where it incorrectly classifies the song as having a high liveness. The timbre and chroma self-similarity matrices sadly do not really bring out the liveness. However, they do show the repeating pattern in the song very clearly. The chroma creates an almost checkerboard-like structure, whilst the timbre shows the fade-ins and fade-outs that the song makes. This fading pattern is especially visible towards the beginning and end of the song.
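The checkerboard structure follows directly from how a self-similarity matrix is built: every frame of the song is compared with every other frame. The Python sketch below does this for four made-up feature frames that alternate between two states. Cosine distance is an assumption here, as the distance measure behind the plots is not stated.

```python
from math import sqrt

def cosine_distance(u, v):
    """1 minus the cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms

def self_similarity(frames):
    """Matrix of pairwise distances between all frames of one song."""
    return [[cosine_distance(u, v) for v in frames] for u in frames]

# Made-up frames alternating between two states (pattern A B A B).
state_a = (1.0, 0.0, 0.0)
state_b = (0.0, 1.0, 0.0)
frames = [state_a, state_b, state_a, state_b]

matrix = self_similarity(frames)
for row in matrix:
    print([round(value, 2) for value in row])
```

The printed rows alternate between 0 and 1, which is the checkerboard: a repeating section produces low-distance cells wherever two frames from the same state meet.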

Self-similarity Timbre and Chromagram

The Color Of Fire

“The Color Of Fire” by Boards of Canada is a perfect representation of the surrealistic, creepy and weirdly nostalgic feeling that the album “Music Has the Right to Children” presents. The song has a clear progression, starting at B7, then moving to Eb7, Gb7 and Ab7 and back to B7 (followed by a fade-out), which the chordogram plotted below shows very clearly.
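A chordogram is commonly built by comparing each chroma frame with a set of chord templates and keeping the best match. The Python sketch below shows that idea for a single idealized frame. The template set (major, minor and dominant seventh chords), the cosine similarity score and all names are assumptions for illustration; flats are spelled as sharps, so Eb7 appears as D#7.

```python
from math import sqrt

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Semitone intervals above the root for each chord type in this sketch.
CHORD_SHAPES = {"": (0, 4, 7), "m": (0, 3, 7), "7": (0, 4, 7, 10)}

def chord_templates():
    """Binary 12-bin chroma templates for every root and chord type."""
    templates = {}
    for root, name in enumerate(NOTE_NAMES):
        for suffix, intervals in CHORD_SHAPES.items():
            template = [0.0] * 12
            for interval in intervals:
                template[(root + interval) % 12] = 1.0
            templates[name + suffix] = template
    return templates

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def best_chord(chroma):
    """Name of the template that is most similar to the chroma frame."""
    templates = chord_templates()
    return max(templates, key=lambda name: cosine_similarity(chroma, templates[name]))

def chroma_from_notes(*notes):
    """Idealized chroma frame with equal energy on the given note names."""
    return [1.0 if name in notes else 0.0 for name in NOTE_NAMES]

print(best_chord(chroma_from_notes("B", "D#", "F#", "A")))    # notes of B7
print(best_chord(chroma_from_notes("D#", "G", "A#", "C#")))   # notes of Eb7
```

Real chroma frames are noisier than these idealized ones, so a practical chordogram shows the similarity to every template over time instead of only the single best label.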

Chordogram - The Color Of Fire

Fastest vs slowest song

To the right are tempograms of the slowest and fastest songs in the corpus according to Spotify’s algorithm. The fastest song is “ハイライト” by MASS OF THE FERMENTING DREGS with a BPM of 206.37, and the slowest song in the corpus is “Where I Found You (One Star)” by Candy Claws with a BPM of 0. Yes, you read that right: the slowest song (according to the Spotify algorithm) has a BPM of 0. This has to be a mistake. Whilst the song definitely has a slower tempo, a BPM of 0 is virtually impossible. So why does it happen?

First, let’s look at the tempogram of ハイライト, which clearly shows lines in the 200-210 and 400-420 BPM ranges, in line with the tempo given by Spotify. We can also see an interesting pattern in the tempogram: the tempo is not a straight line throughout the song. This is likely due to the song being played by a band, which might not always play at a perfectly steady tempo (compared to a song made on a computer).

In comparison to the tempogram of the fastest song, that of the slowest song seems all over the place. There are some instances where we can see the BPM, but only for a short period. This is probably the reason why Spotify’s algorithm failed.
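Why a steady pulse shows up as a line at its tempo and again at double that tempo can be sketched with a small Fourier-style calculation in Python. This assumes the tempograms above are Fourier-based, which is not stated, and it analyzes one whole signal where a real tempogram repeats the calculation over short windows. The click track is synthetic and all names are illustrative.

```python
import cmath
import math

def click_track(bpm, seconds, frame_rate=100):
    """Synthetic onset signal: a 1.0 on every beat, 0.0 elsewhere."""
    period = round(60 * frame_rate / bpm)  # frames between two beats
    return [1.0 if i % period == 0 else 0.0 for i in range(int(seconds * frame_rate))]

def tempo_strength(novelty, bpm, frame_rate=100):
    """Magnitude of the signal's Fourier component at the given tempo."""
    freq = bpm / 60.0  # beats per second
    total = sum(x * cmath.exp(-2j * math.pi * freq * i / frame_rate)
                for i, x in enumerate(novelty))
    return abs(total) / len(novelty)

def strongest_tempo(novelty, candidates, frame_rate=100):
    return max(candidates, key=lambda bpm: tempo_strength(novelty, bpm, frame_rate))

# A made-up, perfectly steady 200 BPM pulse lasting 10 seconds.
clicks = click_track(200, seconds=10)

best = strongest_tempo(clicks, range(60, 241))
print(best)
for bpm in (200, 300, 400):
    print(bpm, round(tempo_strength(clicks, bpm), 4))
```

The steady pulse is equally strong at 200 and 400 BPM and vanishes at 300 BPM, which matches the paired lines described for the fastest song. A signal without a stable pulse has no tempo that stands out in this way, so any single BPM estimate for it is unreliable.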

Tempograms

Fastest Song
Slowest Song

Findings

Concluding the Analysis of my Personal Favorite Albums

Over the few weeks that we had to work on our analysis, I discovered many things about my own music taste, mainly what piques my interest or disinterest in different albums and songs. The global analysis showed that, even though there is a vast difference in the energy, valence, tempo and keys between the songs in the corpus, there is also a great amount of overlap in the liveness and, more importantly, the genres. This led me to analyze the differences between the 2 most common genres in my corpus, which showed me that there might be more overlap between the songs I like than first meets the eye. Rock and ambient share more variables than one might think, which explains my liking for specific albums within those genres.

In the clustering analysis I sought to further explain the different genres and the impact of the different Spotify genres. Here, I discovered that things like the tempo, key, instrumentalness and acousticness differ quite a bit between songs and albums and are of less importance to my music taste, and that the energy, valence, liveness, speechiness and danceability are fairly consistent throughout the corpus, meaning that they are of greater importance. Furthermore, the cluster analysis showed me that, to my surprise, there were more than 2 groups that explained my corpus. Where I first thought that my main interests lay between ambient and rock, there was now a third group. Not only that, but ambient and rock might not even be the best descriptors of the other groups.

Besides analyzing the entire selection of songs, I also analyzed individual songs that seemed the most interesting to me or that were outliers on certain variables. This gave me deeper insight into the structure of different songs in the corpus, such as their tempo, rhythm, timbre and layout (repeating patterns etc.).

To conclude, I discovered that my preference for music lies not in the tempo, the keys or the instrumentalness, but rather in the valence, liveness and speechiness. It also lies, as demonstrated in the individual song tab, in weird and unusual patterns hinting at slightly ambient and dreamlike music. In contrast, there is also a portion of the corpus that completely opposes that notion and shows strict recurring patterns and tempos.